swap memory location into the shared memory location at block 513. At block 515 the 
thread indicates that it has performed the atomic operation (e.g., sets an execution bit to 
true). 

[0048] Figure 6C is a diagram illustrating the thread 603 updating the shared 
memory location 607 according to one embodiment of the invention. In Figure 6C, the 
thread 605 performs the CAS instruction of the program unit 409 to ensure atomicity of 
the memory update operation. In accordance with the example of the CAS instruction 
described above, the thread 603 determines if the shared- value in the shared memory 
location 607 is equal to the compare value in the memory location 609. If the values are 
equal, then atomicity for the memory update operation has not been violated and the 
thread 603 loads the swap value from the memory location 611 into the shared memory 
location 607. When the thread 605 performs the exemplary CAS instruction of the 
program unit 409, the shared- value and the compare value are not equal because the 
thread 603 has updated the shared memory location 607 with the thread's 603 swap value. 
Hence, the thread 605 determines that atomicity has been violated from its perspective 
(assuming that the thread's 603 swap value is different than the shared- value originally 
stored in the shared memory location 607). 

[0049] Figure 6D is a diagram illustrating the thread 605 performing the loop of 
the program unit 409 according to one embodiment of the invention. Since atomicity has 
not been preserved for the thread 605, the thread 605 loads the shared-value, which is the 
thread's 603 swap value, from the shared-memory location 607 into the memory location 
615. The thread 605 will perform the callback routine again to determine its swap value 
and store the swap value into the memory location 617. The thread 605 will repeat this 
loop until the thread 605 is terminated or the thread 605 successfully performs the 
memory update operation atomically. 
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[0050] The present embodiments of the invention provide efficient, scalable, 
parallel atomic operations. A single instantiation of the present embodiments of the 
invention can implement efficient atomic operations across multiple platforms (i.e., 
variants of hardware architecture, operating system, threading environment, compilers 
and programming tools, utility software, etc.) and yet optimize performance for each 
individual platform. Furthermore, the present embodiments of the invention provide the 
ability to tailor performance and explore performance/implementation-cost trade-offs on 
each of multiple platforms. For example, low-level instruction sets (i.e., assembly code 
instructions) maybe developed to support specified operators, data-types, and/or data- 
type sizes at a reasonable cost while still providing the ability to optimize other operators, 
data-types, and/or data-type sizes differently with the described embodiments of the 
present invention. The present embodiments of the invention provide for using low-level 
instruction sets to fully support a spectrum of operations, data-types, and data-type sizes 
on an individual platform in order to optimize performance on that platform, while still 
providing the ability to optimize performance separately on other platforms. 

[0051] While the invention has been described in terms of several embodiments, 
those skilled in the art will recognize that the invention is not limited to the embodiments 
described. In various embodiments of the invention, program units are translated from 
source code to source code or from source code to object code or assembly language. In 
alternative embodiments of the invention, an interpreter interprets program units, code is 
inlined instead of making calls to a runtime library, etc. 

[0052] Furthermore, the atomic update to a shared memory location can be 
performed by a thread as described, processor, operating system process, etc. 

[0053] The method and apparatus of the invention can be practiced with 
modification and alteration within the spirit and scope of the appended claims. 
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Embodiments of the invention are described with reference to an atomic operation. The 
form of an atomic operation depends on the implementing language. For example, an 
atomic operation may take the form of a directive and a memory update operation, a call 
to a single function defined as an atomic operation, etc. The description is thus to be 
regarded as illustrative instead of limiting on the invention. 
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